Leveraging Site Search Logs to Identify Missing Content on Enterprise Webpages

Authors

  • Harsh Jhamtani
  • Rishiraj Saha Roy
  • Niyati Chhaya
  • Eric Nyberg
Abstract

Pearson’s correlation coefficient (r) between the vectors of counts and residual values over all tuples was found to be very close to zero (−0.035). The Kendall rank correlation coefficient τ between the ranked lists obtained by ordering the (w, q) tuples by frequency and by residual value was found to be −4.65 × 10⁻⁹. Both values indicate almost no correlation between counts and residuals.
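The two statistics mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `pearson_r` and `kendall_tau_b`, the tau-b variant (which handles ties), and the toy `counts` and `residuals` values are all assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def kendall_tau_b(xs, ys):
    """Kendall tau-b computed by a direct O(n^2) scan over all pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denominator


# Hypothetical toy data: one entry per (w, q) tuple.
counts = [12, 7, 7, 3, 1]                 # assumed tuple frequencies
residuals = [0.30, -0.10, 0.20, -0.40, 0.05]  # assumed residual values

r = pearson_r(counts, residuals)
tau = kendall_tau_b(counts, residuals)
print(f"Pearson r = {r:.3f}, Kendall tau-b = {tau:.3f}")
```

Values of both coefficients near zero, as reported in the abstract, would mean that how often a tuple occurs says almost nothing about how large its residual is. The quadratic pair scan is only suitable for small inputs; a large log would need an O(n log n) method.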

Similar resources

Data Preparation for Web Mining – A survey

An accepted trend is to categorize web mining into three main areas: web content mining, web structure mining and web usage mining. Web content mining involves extracting details/information from the contents of webpages and performing things like knowledge synthesis. Web structure mining involves the usage of graph theory to understand website structure/hierarchy. Web usage mining involves the...

A Novel Architecture for Detecting Phishing Webpages using Cost-based Feature Selection

Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features to determine whether it is a phishing webpage or not. Selecting appropriate features improves the performance of PWDS. Performance criteria are detection accuracy and system response time. The major time consumed by PWDS arises from feature extraction that ...

Long-Term Learning for Web Search Engines

This paper considers how web search engines can learn from the successful searches recorded in their user logs. Document Transformation is a feasible approach that uses these logs to improve document representations. Existing test collections do not allow an adequate investigation of Document Transformation, but we show how a rigorous evaluation of this method can be carried out using the refer...

Using the Results of CPTu to Identify the Subsurface Sediment Layers in Urmia Lake Bridge Site, NW Iran

Specifying the soil types and profiling the subsurface soil layers are the excellent examples of CPTu test potentials. In this research, the capability of CPTu test for specifying subsurface soil layers and classification of sediments in Urmia Lake is investigated. According to previous studies, the sediments of Urmia Lake are commonly fine grained and soft deposits with organic materials. To e...

Keyword Extraction for Webpage Clusters

The volume of unstructured information presented on the Internet is constantly increasing, together with the total amount of websites and their contents. To process this vast amount of information it is important to distinguish different clusters of related webpages. Such clusters are used, for example, for template induction, keyword extraction, and recommendation algorithms. A variety of appl...

Publication date: 2017